Using Predicted Outcome Stratified Sampling to Reduce the Variability in Predictive Performance of a One-Shot Train-and-Test Split for Individual Customer Predictions

Authors

  • Geert Verstraeten
  • Dirk Van den Poel
Abstract

It is generally recognised that performance estimates are overly optimistic when a model is evaluated on the data used to construct it. In predictive modeling practice, the assessment of a model's predictive performance therefore frequently relies on a one-shot train-and-test split between the observations used for estimating the model and those used for validating it. Previous research has indicated the usefulness of stratified sampling for reducing the variation in predictive performance in a linear regression application. In this paper, we validate those findings on six real-life European predictive modeling applications for marketing and credit scoring, each using a dichotomous outcome variable. Using a procedure we describe as predicted outcome stratified sampling in a logistic regression model, we find confirmation for the reduction in variability. The gain in variation reduction is almost always significant, even in large data sets, and in certain applications markedly high.
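The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one plausible reading: rank observations by the predicted outcome of a preliminary model, cut the ranking into strata, and split each stratum at random between the training and test sets. The function name, the `n_strata` and `test_fraction` parameters, and the synthetic scores are all assumptions made for illustration; in practice the scores would come from a model such as the logistic regression mentioned above.

```python
import random


def predicted_outcome_stratified_split(scores, test_fraction=0.5, n_strata=10, seed=0):
    """Split observation indices into train and test sets so that both
    sets cover the range of predicted outcomes similarly.

    scores: predicted probabilities from a preliminary model
            (assumed to be supplied by the caller).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Rank observations by their predicted outcome.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    train, test = [], []
    for s in range(n_strata):
        # Consecutive ranks form one stratum of roughly equal size.
        stratum = order[s * n // n_strata:(s + 1) * n // n_strata]
        rng.shuffle(stratum)
        k = round(len(stratum) * test_fraction)
        test.extend(stratum[:k])
        train.extend(stratum[k:])
    return sorted(train), sorted(test)


# Hypothetical scores standing in for predicted probabilities.
scores = [i / 99 for i in range(100)]
train_idx, test_idx = predicted_outcome_stratified_split(
    scores, test_fraction=0.5, n_strata=10, seed=42
)
```

Because every stratum contributes the same share of its observations to each set, repeated splits with different seeds should differ less in their mix of high and low predicted outcomes than purely random splits would, which is the intuition behind the variance reduction the abstract reports.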

Similar resources

Modelling of Conventional and Severe Shot Peening Influence on Properties of High Carbon Steel via Artificial Neural Network

Shot peening (SP), as one of the severe plastic deformation (SPD) methods is employed for surface modification of the engineering components by improving the metallurgical and mechanical properties. Furthermore artificial neural network (ANN) has been widely used in different science and engineering problems for predicting and optimizing in the last decade. In the present study, effects of conv...

Performance Evaluation of Dynamic Modulus Predictive Models for Asphalt Mixtures

Dynamic modulus characterizes the viscoelastic behavior of asphalt materials and is the most important input parameter for design and rehabilitation of flexible pavements using Mechanistic–Empirical Pavement Design Guide (MEPDG). Laboratory determination of dynamic modulus is very expensive and time consuming. To overcome this challenge, several predictive models were developed to determine dyn...

The Predictive Factors of Job Performance in Nurses' Moral Distress

Background: Moral distress is one of the most complex ethical problems for nurses working in Intensive Care Units. Desired job performance of the nurse guarantees the quality of health care provided to patients and is an important factor in accelerating the process of treatment and recovery of patients. This study was conducted to investigate the predictive factors of job performance in nursesc...

Interprofessional Education: a Step towards Team Work Improvement in Cardio-Pulmonary Resuscitation

Introduction: Cardiopulmonary arrest is one of the main medical urgencies. Studies show that 20% to 30% of patients could be resuscitated via an efficient cardio-pulmonary resuscitation (CPR). Training CPR through interprofessional method could lead to improving the performance of resuscitation group. The aim of this research was to study the effects of Interprofessional training on resuscitati...

Predictive role of emotional intelligence and individual occupational factors on occupational stress among the midwives working in health centers and hospitals in Rasht, 2018

Background & Aim: Midwives experience high levels of stress due to the nature of their work. Some factors can play a significant role in the occupational stress experience. The aim of the study was to investigate the predictive role of emotional intelligence and individual-occupational factors on occupational stress among the midwives working in Rasht. Methods & Materials: In this descriptive,...

Publication date: 2006